This is a start towards allowing arguments that aren't UTF-8. At the moment there's just the groundwork to show that argument parsing is also possible based on `&OsStr`. It's not finished yet, but I'd like to get some feedback and discuss some design decisions beforehand. The API of `gumdrop` itself has to change quite a bit, but I think users that just use `derive(Options)` shouldn't need to notice much of the change.

Related to #15.
What becomes `&OsStr` or `AsRef<OsStr>`?

The parser now accepts `AsRef<OsStr>` instead of `AsRef<str>`, but ordinary strings and string slices implement this too, so that isn't too big of a change. All commands and short/long options stay `&str`, but free arguments and arguments to long options become `&OsStr` before being parsed in `gumdrop_derive`.

The functions `gumdrop::{parse_args, parse_args_default}` currently accept `&[String]`. I think either they can keep their names but accept `&[S] where S: AsRef<OsStr>`, or two new functions, `gumdrop::{parse_args_os, parse_args_os_default}`, that accept `&[OsString]` can be added.

In `gumdrop_derive`, the parsing options based on `&str` can remain: the arguments are first converted from `&OsStr` to `&str`, yielding an error if that's not possible (without this PR, this would panic). For this purpose I made a doc-hidden but public function, `gumdrop::to_str`; I'm not sure if that's the best approach. In addition to parsing based on `&str`, parsing options based on `&OsStr` can be added (basically the whole point of this PR):

- `parse(from_os_str = "...")` for `fn(&OsStr) -> T`
- `parse(try_from_os_str = "...")` for `fn(&OsStr) -> Result<T, E> where E: Display`
- `parse(from_os_str)` uses `std::convert::From::from`
- `parse(try_from_str)` has a good `OsStr` equivalent

Limitations
If available, the platform-specific `OsStrExt` is used to look ahead and parse arguments starting from the beginning. Otherwise, the only API available to get at the inner data of the `OsStr` is `to_string_lossy`, which checks the entire string for valid UTF-8 and, if it isn't valid, allocates a `String` in which invalid UTF-8 is replaced with the replacement character. This means an option like `--valid-utf8=invalid-utf8` cannot be parsed into `LongWithArg(&str, &OsStr)`, because there is no way to shorten the `&OsStr` to `invalid-utf8` while keeping the invalid UTF-8 sequences. However, `--valid-utf8 invalid-utf8` (with a space instead of an equals sign) can be parsed correctly on all platforms, and most platforms (especially in terms of usage) implement a form of `OsStrExt`.

At the moment I crudely check `cfg(unix)` for Unix's version of `OsStrExt`, but this should be extended to other platforms that export the same `OsStrExt` without the rest of the Unix-specific APIs.

Windows has a different `OsStrExt` trait, which cannot turn `--valid-utf8=invalid-utf8` into `LongWithArg(&str, &OsStr)`, but can turn it into `LongWithArg(Cow<str>, Cow<OsStr>)`. This is because Windows uses 16-bit code units internally, which means you can't shorten a `&OsStr`, but you can create a new `OsString` that contains only part of the original `&OsStr`.
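To make the Unix case concrete, here is a minimal sketch of how `--name=value` could be split without losing non-UTF-8 bytes in the value. It is not the code in this PR: the `LongWithArg` struct below is only a stand-in for the parser's variant of the same name, and `split_long_with_arg` is a hypothetical helper introduced for illustration. It relies only on `std::os::unix::ffi::OsStrExt`, so it is gated on `cfg(unix)`.

```rust
use std::ffi::OsStr;
#[cfg(unix)]
use std::os::unix::ffi::OsStrExt;

/// Stand-in (assumption) for the parser's "long option with argument" case:
/// the option name must be valid UTF-8, the value may be arbitrary bytes.
#[derive(Debug, PartialEq)]
pub struct LongWithArg<'a>(pub &'a str, pub &'a OsStr);

/// Hypothetical helper: splits `--name=value` at the first `=`.
/// Returns `None` if the argument does not start with `--`, contains no `=`,
/// or has a name that is not valid UTF-8.
#[cfg(unix)]
pub fn split_long_with_arg(arg: &OsStr) -> Option<LongWithArg<'_>> {
    // On Unix an `OsStr` is just bytes, so it can be sliced freely.
    let rest = arg.as_bytes().strip_prefix(b"--")?;
    // `=` is ASCII, so it never occurs inside a multi-byte UTF-8 sequence.
    let eq = rest.iter().position(|&b| b == b'=')?;
    let name = std::str::from_utf8(&rest[..eq]).ok()?;
    // Borrow the tail as `&OsStr` without validating or copying it.
    let value = OsStr::from_bytes(&rest[eq + 1..]);
    Some(LongWithArg(name, value))
}
```

The borrowed value could then be handed to any `fn(&OsStr) -> T`, for example `PathBuf::from`, which is the kind of function `parse(from_os_str = "...")` is meant for. On platforms without a byte-level `OsStrExt`, no equivalent borrow is possible, which is the limitation described above.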